Robust and Scalable Hyperdimensional Computing With Brain-Like Neural Adaptations
The Internet of Things (IoT) has facilitated many applications utilizing
edge-based machine learning (ML) methods to analyze locally collected data.
Unfortunately, popular ML algorithms often require intensive computations
beyond the capabilities of today's IoT devices. Brain-inspired hyperdimensional
computing (HDC) has been introduced to address this issue. However, existing
HDCs use static encoders, requiring extremely high dimensionality and hundreds
of training iterations to achieve reasonable accuracy. This results in a huge
efficiency loss, severely impeding the application of HDCs in IoT systems. We
observed that a main cause is that the encoding module of existing HDCs lacks
the capability to utilize and adapt to information learned during training. In
contrast, neurons in the human brain regenerate continually and provide more
useful functionality as new information is learned. While the goal of HDC is to
exploit the high dimensionality of randomly generated base hypervectors to
represent information as a pattern of neural activity, it remains challenging
for existing HDCs to support behavior similar to the brain's neural
regeneration. In this work, we present dynamic HDC learning frameworks
that identify and regenerate undesired dimensions to provide adequate accuracy
with significantly lowered dimensionality, thereby accelerating both training
and inference.
Comment: arXiv admin note: substantial text overlap with arXiv:2304.0550
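The core mechanic, scoring each dimension's usefulness and re-randomizing the weakest ones, can be sketched in a few lines. This is a minimal illustration assuming Gaussian base hypervectors and a simple variance-based separability score (both assumptions for exposition, not the paper's exact criteria): dimensions along which the trained class hypervectors barely differ carry little discriminative information and are regenerated.

```python
import numpy as np

def regenerate_dimensions(class_prototypes, base, frac=0.1, rng=None):
    """Re-randomize the least discriminative hypervector dimensions.

    class_prototypes: (num_classes, D) trained class hypervectors
    base:             (num_features, D) random base hypervectors
    frac:             fraction of dimensions to regenerate
    """
    rng = rng or np.random.default_rng()
    # Score each dimension by the variance of the class prototypes along it:
    # low variance means the dimension barely separates the classes.
    scores = class_prototypes.var(axis=0)
    k = int(frac * base.shape[1])
    worst = np.argsort(scores)[:k]
    # Regenerate the undesired dimensions of every base hypervector
    # (Gaussian values here; bipolar or level hypervectors work analogously).
    base[:, worst] = rng.standard_normal((base.shape[0], k))
    return base, worst
```

After regeneration, the model is retrained for a few iterations so the fresh dimensions can pick up useful structure; repeating this loop is how a low-dimensional model can recover the accuracy of a much larger static one.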
Runtime Adaptive System-on-Chip Communication Architecture
The adaptive system provides adaptivity both
in the system-level and in the architecture-level. The system-level adaptation is provided
using a runtime application mapping. The architecture-level adaptation is implemented by using
several novel methodologies to increase the resource utilization of the underlying silicon
fabric, i.e. sharing the Virtual Channel Buffers among different output ports. To achieve successful runtime adaptation, a runtime observability infrastructure is included
BayesImposter: Bayesian Estimation Based .bss Imposter Attack on Industrial Control Systems
Over the last six years, several papers used memory deduplication to trigger
various security issues, such as leaking heap-address and causing bit-flip in
the physical memory. The most essential requirement for successful memory
deduplication is to provide identical copies of a physical page. Recent works
use a brute-force approach to create identical copies of a physical page, which
is an inaccurate and time-consuming primitive from the attacker's perspective.
Our work begins to fill this gap by providing a domain-specific structured
way to duplicate a physical page in cloud settings in the context of industrial
control systems (ICSs). Here, we show a new attack primitive -
\textit{BayesImposter}, which points out that the attacker can duplicate the
.bss section of the target control DLL file of cloud protocols using the
\textit{Bayesian estimation} technique. Our approach requires less memory
(4 KB rather than gigabytes) and less time (13 minutes rather than hours) than
the brute-force approach used in recent works. We point out that
ICSs can be expressed as state-space models; hence, the \textit{Bayesian
estimation} is an ideal choice to be combined with memory deduplication for a
successful attack in cloud settings. To demonstrate the strength of
\textit{BayesImposter}, we create a real-world automation platform using a
scaled-down automated high-bay warehouse and industrial-grade SIMATIC S7-1500
PLC from Siemens as a target ICS. We demonstrate that \textit{BayesImposter}
can predictively inject false commands into the PLC, potentially causing
equipment damage and machine failure in the target ICS. Moreover, we show that
\textit{BayesImposter} is capable of adversarial control over the target ICS
resulting in severe consequences, such as killing a person while making it look
like an accident. We therefore also provide countermeasures to prevent the
attack.
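Because the plant is modeled as a linear state-space system, the prediction step can be illustrated with a standard Kalman filter. The sketch below is a generic Bayesian predict/update cycle, not the paper's exact estimator, and the variable names are illustrative:

```python
import numpy as np

def kalman_step(x, P, A, C, Q, R, y):
    """One Bayesian predict/update cycle for the linear model
        x[k+1] = A x[k] + w,   y[k] = C x[k] + v,
    with w ~ N(0, Q) and v ~ N(0, R). The estimated state is what an
    attacker would use to reconstruct the values expected to populate
    the target .bss page of the control DLL."""
    # Predict the next state and its covariance.
    x_pred = A @ x
    P_pred = A @ P @ A.T + Q
    # Correct the prediction with the observed plant output y.
    S = C @ P_pred @ C.T + R
    K = P_pred @ C.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (y - C @ x_pred)
    P_new = (np.eye(len(x)) - K @ C) @ P_pred
    return x_new, P_new
```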
Rampo: A CEGAR-based Integration of Binary Code Analysis and System Falsification for Cyber-Kinetic Vulnerability Detection
This paper presents a novel tool, named Rampo, that performs binary code
analysis to identify cyber-kinetic vulnerabilities in cyber-physical systems
(CPS). The tool takes as
input a Signal Temporal Logic (STL) formula that describes the kinetic effect,
i.e., the behavior of the physical system, that one wants to avoid. The tool
then searches the possible cyber trajectories in the binary code that may lead
to such physical behavior. This search integrates binary code analysis tools
and hybrid systems falsification tools using a Counter-Example Guided
Abstraction Refinement (CEGAR) approach. Rampo starts by analyzing the binary
code to extract symbolic constraints that represent the different paths in the
code. These symbolic constraints are then passed to a Satisfiability Modulo
Theories (SMT) solver to extract the range of control signals that can be
produced by each path in the code. The next step is to search over possible
physical trajectories using a hybrid systems falsification tool that adheres to
the behavior of the cyber paths and yet leads to violations of the STL formula.
Since the number of cyber paths that need to be explored increases
exponentially with the length of physical trajectories, we iteratively perform
refinement of the cyber path constraints based on the previous falsification
result and traverse the abstract path tree obtained from the control program to
explore the search space of the system. To illustrate the practical utility of
binary code analysis in identifying cyber kinetic vulnerabilities, we present
case studies from diverse CPS domains, showcasing how they can be discovered in
their control programs. Our tool computed the same number of vulnerabilities
while achieving speedups ranging from 3x to 98x.
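The loop structure can be summarized in pseudocode. The skeleton below is a paraphrase of the CEGAR iteration described above; smt_ranges, falsify, and path.refine() are illustrative placeholders for the SMT query, the hybrid-systems falsifier, and the path-tree refinement step:

```python
def rampo_cegar(initial_paths, smt_ranges, falsify, stl_formula, max_iters=50):
    """Alternate between (1) SMT-derived control-signal ranges for an
    abstract cyber path and (2) falsification of the STL formula under
    those ranges, refining paths whose abstraction was too coarse."""
    frontier = list(initial_paths)   # abstract cyber paths still to explore
    violations = []
    for _ in range(max_iters):
        if not frontier:
            break
        path = frontier.pop()
        lo, hi = smt_ranges(path)            # control signals this path can emit
        trace = falsify(stl_formula, lo, hi) # search for a violating trajectory
        if trace is not None:
            violations.append((path, trace)) # concrete cyber-kinetic vulnerability
        else:
            # No violation at this abstraction level: split the abstract
            # path using the falsification result and revisit its children.
            frontier.extend(path.refine())
    return violations
```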
Inter-Layer Scheduling Space Exploration for Multi-model Inference on Heterogeneous Chiplets
To address the increasing compute demand of recent multi-model workloads with
heavy models such as large language models, we propose deploying heterogeneous
chiplet-based multi-chip module (MCM) accelerators. We develop an advanced
scheduling framework for heterogeneous MCM accelerators that comprehensively
considers complex heterogeneity and inter-chiplet pipelining.
Our experiments using our framework on the GPT-2 and ResNet-50 models on a
4-chiplet system show up to 2.2x and 1.9x increases in throughput and energy
efficiency, respectively, compared to a monolithic accelerator with an
optimized output-stationary dataflow.
Comment: Accepted poster abstract at the IBM IEEE AI Compute Symposium
(AICS'23)
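To make the scheduling problem concrete, here is a brute-force baseline (purely illustrative; the paper's framework explores a far richer space that includes pipelining and communication costs) that assigns each layer of a model to a chiplet so that the slowest pipeline stage is as fast as possible:

```python
from itertools import product

def best_assignment(layer_costs, chiplet_speeds):
    """Exhaustively try every layer-to-chiplet assignment and keep the
    one whose bottleneck (the most loaded chiplet, which bounds pipeline
    throughput) is smallest. Feasible only for tiny instances."""
    n = len(chiplet_speeds)
    best, best_bottleneck = None, float("inf")
    for assign in product(range(n), repeat=len(layer_costs)):
        load = [0.0] * n
        for layer, chiplet in enumerate(assign):
            load[chiplet] += layer_costs[layer] / chiplet_speeds[chiplet]
        bottleneck = max(load)
        if bottleneck < best_bottleneck:
            best, best_bottleneck = assign, bottleneck
    return best, best_bottleneck
```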
Romanus: Robust Task Offloading in Modular Multi-Sensor Autonomous Driving Systems
Due to the high performance and safety requirements of self-driving
applications, the complexity of modern autonomous driving systems (ADS) has
been growing, instigating the need for more sophisticated hardware which could
add to the energy footprint of the ADS platform. Addressing this, edge
computing is poised to encompass self-driving applications, enabling the
compute-intensive autonomy-related tasks to be offloaded for processing at
compute-capable edge servers. Nonetheless, the intricate hardware architecture
of ADS platforms, in addition to the stringent robustness demands, sets forth
complications for task offloading that are unique to autonomous driving.
Hence, we present Romanus, a methodology for robust and efficient task
offloading for modular ADS platforms with multi-sensor processing pipelines.
Our methodology entails two phases: (i) the introduction of efficient
offloading points along the execution path of the involved deep learning
models, and (ii) the implementation of a runtime solution based on Deep
Reinforcement Learning to adapt the operating mode according to variations in
the perceived road scene complexity, network connectivity, and server load.
Experiments on the object detection use case demonstrated that our approach is
14.99% more energy-efficient than pure local execution while achieving a 77.06%
reduction in risky behavior compared to a robustness-agnostic offloading
baseline.
Comment: This paper has been accepted to the 2022 International Conference on
Computer-Aided Design (ICCAD 2022)
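The runtime controller's decision problem can be sketched with a tabular stand-in for the paper's deep RL agent (a simplification; the state variables and operating modes below are illustrative): discretize (scene complexity, connectivity, server load) into a state index and learn which mode to pick.

```python
import random

class OffloadAgent:
    """Tabular Q-learning stand-in for the DRL controller: map a
    discretized (scene complexity, connectivity, server load) state to
    an operating mode (e.g., local execution, offload at point i,
    full offload)."""

    def __init__(self, n_states, n_modes, eps=0.1, alpha=0.5, gamma=0.9):
        self.q = [[0.0] * n_modes for _ in range(n_states)]
        self.eps, self.alpha, self.gamma = eps, alpha, gamma

    def act(self, s):
        if random.random() < self.eps:          # explore occasionally
            return random.randrange(len(self.q[s]))
        return max(range(len(self.q[s])), key=lambda a: self.q[s][a])

    def learn(self, s, a, reward, s_next):
        # The reward would combine energy savings with a penalty for
        # risky, deadline-violating behavior.
        target = reward + self.gamma * max(self.q[s_next])
        self.q[s][a] += self.alpha * (target - self.q[s][a])
```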
DOMINO: Domain-invariant Hyperdimensional Classification for Multi-Sensor Time Series Data
With the rapid evolution of the Internet of Things, many real-world
applications utilize heterogeneously connected sensors to capture time-series
information. Edge-based machine learning (ML) methodologies are often employed
to analyze locally collected data. However, a fundamental issue across
data-driven ML approaches is distribution shift. It occurs when a model is
deployed on a data distribution different from what it was trained on, and can
substantially degrade model performance. Additionally, increasingly
sophisticated deep neural networks (DNNs) have been proposed to capture spatial
and temporal dependencies in multi-sensor time series data, requiring intensive
computational resources beyond the capacity of today's edge devices. While
brain-inspired hyperdimensional computing (HDC) has been introduced as a
lightweight solution for edge-based learning, existing HDCs are also vulnerable
to the distribution shift challenge. In this paper, we propose DOMINO, a novel
HDC learning framework addressing the distribution shift problem in noisy
multi-sensor time-series data. DOMINO leverages efficient and parallel matrix
operations on high-dimensional space to dynamically identify and filter out
domain-variant dimensions. Our evaluation on a wide range of multi-sensor time
series classification tasks shows that DOMINO achieves on average 2.04% higher
accuracy than state-of-the-art (SOTA) DNN-based domain generalization
techniques, and delivers 16.34x faster training and 2.89x faster inference.
More importantly, DOMINO performs notably better when learning from partially
labeled and highly imbalanced data, providing 10.93x higher robustness against
hardware noise than SOTA DNNs.
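The dimension-filtering idea can be sketched as follows; the variance-based score here is an assumed stand-in for DOMINO's published formulation. Dimensions whose values swing across domains for the same class are treated as domain-variant and masked out before similarity search:

```python
import numpy as np

def domain_invariant_mask(domain_prototypes, keep=0.9):
    """domain_prototypes: (num_domains, num_classes, D) per-domain class
    hypervectors. Score each dimension by its variance across domains
    (averaged over classes) and keep the `keep` fraction varying least."""
    score = domain_prototypes.var(axis=0).mean(axis=0)   # (D,)
    k = int(keep * score.size)
    mask = np.zeros(score.size, dtype=bool)
    mask[np.argsort(score)[:k]] = True                   # most invariant dims
    return mask  # multiply query/class hypervectors by mask before comparing
```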
ERUDITE: Human-in-the-Loop IoT for an Adaptive Personalized Learning System
Thanks to the rapid growth of wearable technologies and recent advances in
machine learning and signal processing, monitoring complex human contexts has
become feasible, paving the way for human-in-the-loop IoT systems that
naturally evolve to adapt to the human and environment state autonomously.
Nevertheless, a central challenge in designing many of these IoT systems arises
from the requirement to infer the human mental state, such as intention,
stress, cognitive load, or learning ability. While different human contexts can
be inferred from the fusion of different sensor modalities that can correlate
to a particular mental state, the human brain provides a richer sensor modality
that gives us more insights into the required human context. This paper
proposes ERUDITE, a human-in-the-loop IoT system for the learning environment
that exploits recent wearable neurotechnology to decode brain signals. Through
insights from concept learning theory, ERUDITE can infer the human state of
learning and understand when human learning increases or declines. By
quantifying human learning as an input sensory signal, ERUDITE can provide
adequate personalized feedback to humans in a learning environment to enhance
their learning experience. ERUDITE was evaluated across participants, and the
results showed that, by using brain signals as a sensor modality to infer the
human learning state and providing personalized adaptation to the learning
environment, the participants' learning performance increased on average
by . Furthermore, we showed that ERUDITE can be deployed on an edge-based
prototype to evaluate its practicality and scalability.
Comment: It is under review in the IEEE IoT journal
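The closed-loop adaptation can be illustrated with a toy proportional rule (an illustration only; ERUDITE's actual feedback policy is grounded in concept-learning theory rather than this heuristic):

```python
def adapt_difficulty(difficulty, learning_score, target=0.7, gain=0.1):
    """Nudge task difficulty toward the level where the decoded learning
    state sits at the target: push harder when learning exceeds the
    target, ease off when it falls below. Values are clamped to [0, 1]."""
    error = learning_score - target
    return min(1.0, max(0.0, difficulty + gain * error))
```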
Golden Reference-Free Hardware Trojan Localization using Graph Convolutional Network
The globalization of the Integrated Circuit (IC) supply chain has moved most
of the design, fabrication, and testing process from a single trusted entity to
various untrusted third-party entities worldwide. The risk of using untrusted
third-Party Intellectual Property (3PIP) is the possibility for adversaries to
insert malicious modifications known as Hardware Trojans (HTs). These HTs can
compromise the integrity, deteriorate the performance, deny the service, and
alter the functionality of the design. While numerous HT detection methods have
been proposed in the literature, the crucial task of HT localization is
overlooked. Moreover, the few existing HT localization methods have several
weaknesses: reliance on a golden reference, inability to generalize for all
types of HT, lack of scalability, low localization resolution, and manual
feature engineering/property definition. To overcome their shortcomings, we
propose a novel, golden reference-free HT localization method at the
pre-silicon stage by leveraging Graph Convolutional Network (GCN). In this
work, we convert the circuit design to its intrinsic data structure, a graph,
and extract the node attributes. Afterward, graph convolution performs
automatic feature extraction for nodes to classify the nodes as Trojan or
benign. Our automated approach does not burden the designer with manual code
review. It locates the Trojan signals with 99.6% accuracy, a 93.1% F1-score,
and a false-positive rate below 0.009%.
Comment: IEEE Transactions on Very Large Scale Integration Systems (TVLSI),
202
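The node-classification pipeline rests on standard graph convolutions. One layer, in the common normalized form shown below, is a generic illustration rather than the paper's exact architecture:

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph convolution H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W),
    where A is the netlist graph's adjacency matrix (gates/signals as
    nodes, wires as edges), H holds the node attributes, and W is a
    learned weight matrix. Stacking two such layers and applying a
    per-node softmax over {benign, Trojan} yields the localizer."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ H @ W, 0.0)    # ReLU activation
```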